Eventually-stationary policies for Markov decision models with non-constant discounting

نویسندگان

  • Yair Carmon
  • Adam Shwartz
چکیده

We investigate the existance of simple policies in finite discounted cost Markov Decision Processes, when the discount factor is not constant. We introduce a class called “exponentially representable” discount functions. Within this class we prove existence of optimal policies which are eventually stationary—from some time N onward, and provide an algorithm for their computation. Outside this class, optimal policies with this structure in general do not exist.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Markov decision processes with exponentially representable discounting

We generalize the geometric discount of finite discounted cost Markov Decision Processes to “exponentially representable” discount functions, prove existence of optimal policies which are stationary from some time N onward, and provide an algorithm for their computation. Outside this class, optimal “N-stationary” policies in general do not exist.

متن کامل

Persistently Optimal Policies in Stochastic Dynamic Programming with Generalized Discounting

In this paper we study a Markov decision process with a non-linear discount function. Our approach is in spirit of the von Neumann-Morgenstern concept and is based on the notion of expectation. First, we define a utility on the space of trajectories of the process in the finite and infinite time horizon and then take their expected values. It turns out that the associated optimization problem l...

متن کامل

Markov Decision Processes with General Discount Functions

In Markov Decision Processes, the discount function determines how much the reward for each point in time adds to the value of the process, and thus deeply a ects the optimal policy. Two cases of discount functions are well known and analyzed. The rst is no discounting at all, which correspond to the totaland average-reward criteria. The second case is a constant discount rate, which leads to a...

متن کامل

On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes

We consider infinite-horizon stationary γ-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. Using Value and Policy Iteration with some error ǫ at each iteration, it is well-known that one can compute stationary policies that are 2γ (1−γ)2 ǫ-optimal. After arguing that this guarantee is tight, we develop variations of Value and Policy Iter...

متن کامل

Numerical Analysis of Non-Constant Discounting with an Application to Renewable Resource Management

The possibility of non-constant discounting is important in environmental and resource management problems where current decisions affect welfare in the far-distant future, as with climate change. The difficulty of analyzing models with non-constant discounting limits their application. We describe and provide software to implement an algorithm to numerically obtain a Markov Perfect Equilibrium...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008